Competing against the Best Nearest Neighbor Filter in Regression
Abstract
Designing statistical procedures that are provably almost as accurate as the best one in a given family is one of the central topics in statistics and learning theory. Oracle inequalities then offer a convenient theoretical framework for evaluating different strategies, which can be roughly classified into two classes: selection and aggregation strategies. The ultimate goal is to design strategies satisfying oracle inequalities with leading constant one and a rate-optimal residual term. In many recent papers, this problem is addressed in the case where the aim is to beat the best procedure from a given family of linear smoothers. However, the theory developed so far either does not cover the important case of nearest-neighbor smoothers or provides a suboptimal oracle inequality with a leading constant considerably larger than one. In this paper, we prove a new oracle inequality with leading constant one that is valid under a general assumption on linear smoothers, which makes it possible, for instance, to compete against the best nearest-neighbor filters.
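To make the notion of "the best nearest-neighbor filter" concrete, here is a minimal sketch, not the paper's procedure. It assumes a one-dimensional fixed design, homoscedastic noise with known standard deviation `sigma`, and a known regression function, so that the exact risk of each k-nearest-neighbor smoother can be computed and the oracle choice of k read off. The function names `knn_smoother_matrix` and `oracle_k` are hypothetical and introduced only for illustration.

```python
import numpy as np


def knn_smoother_matrix(x, k):
    """Return the n x n matrix A of the k-NN linear smoother on design points x.

    Row i puts weight 1/k on the k design points nearest to x[i]
    (each point counts as its own nearest neighbor), so the fit is A @ y.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    dist = np.abs(x[:, None] - x[None, :])
    # Stable sort so that ties between equidistant neighbors are broken by index.
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    A = np.zeros((n, n))
    A[np.arange(n)[:, None], nearest] = 1.0 / k
    return A


def oracle_k(x, f_values, sigma, ks):
    """Return (best k, dict of exact risks) over the candidate values ks.

    Assumed model: y = f + noise, noise i.i.d. with variance sigma**2.
    Exact risk of the smoother A:
        E||A y - f||^2 / n = ||(A - I) f||^2 / n + sigma^2 * ||A||_F^2 / n.
    The oracle needs the unknown f, so it is a benchmark, not an estimator.
    """
    f_values = np.asarray(f_values, dtype=float)
    n = len(f_values)
    risks = {}
    for k in ks:
        A = knn_smoother_matrix(x, k)
        squared_bias = np.sum((A @ f_values - f_values) ** 2) / n
        variance = sigma ** 2 * np.sum(A ** 2) / n
        risks[k] = squared_bias + variance
    best = min(risks, key=risks.get)
    return best, risks


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 50)
    f_values = np.sin(2 * np.pi * x)  # illustrative regression function
    best, risks = oracle_k(x, f_values, sigma=0.5, ks=range(1, 21))
    print("oracle k:", best, "oracle risk:", risks[best])
```

An oracle inequality with leading constant one states that the risk of a data-driven selection or aggregation rule is at most the oracle risk computed above, multiplied by exactly one, plus a residual term.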
Similar resources
Liquid-liquid equilibrium data prediction using large margin nearest neighbor
Guanidine hydrochloride has been widely used in the initial recovery steps of active protein from the inclusion bodies in aqueous two-phase system (ATPS). The knowledge of the guanidine hydrochloride effects on the liquid-liquid equilibrium (LLE) phase diagram behavior is still inadequate and no comprehensive theory exists for the prediction of the experimental trends. Therefore the effect the ...
Modelling Climatic Parameters Affecting the Annual Yield of Rheum Ribes Rangeland Species using Data Mining Algorithms
Identification of climatic characteristics affecting the annual yield of Rheum Ribes can be useful in management and development of this species in the rangelands. In this research, the annual yield of this species in Khorasan-Razavi province was evaluated based on 74 climatic parameters during a ten-year period, and the affecting climatic parameters were extracted using data mining methods. First, the role ...
Adaptive Kernel Metric Nearest Neighbor Classification
Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions due to the curse-of-dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose an adaptive nearest neighbor classification method to try to minimize bias. We use quasiconformal transformed kernels t...
Using the SAS® System and SAS® Enterprise Miner™ for Data Mining: A study of Cancer Survival at Mayo Clinic
This paper evaluates and predicts a certain epidemiological (cancer survival) condition using data-mining techniques in SAS®. A data set that contains information about the survival of lung-cancer patients from a study at the Mayo Clinic was extracted from the R survival package. Data-mining techniques—namely linear and logistic regression models, regression and classification trees, and neares...
EFFECT OF THE NEXT-NEAREST NEIGHBOR INTERACTION ON THE ORDER-DISORDER PHASE TRANSITION
In this work, one- and two-dimensional lattices are studied theoretically by a statistical mechanical approach. The nearest and next-nearest neighbor interactions are both taken into account, and the approximate thermodynamic properties of the lattices are calculated. The results of our calculations show that: (1) even though the next-nearest neighbor interaction may have an insignificant ef...
Publication date: 2011